Klokov R, Lempitsky V. Escape from cells: Deep kd-networks for the recognition of 3d point cloud models[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 863-872.
1. Overview
1.1. Motivation
- rasterizing 3D models onto uniform voxel grids leads to a large memory footprint and slow processing
- many spatial indexing structures already exist (kd-trees, octrees, binary space partitioning trees, R-trees, constructive solid geometry)
In this paper, Kd-Networks are proposed, which
- recursively split the point cloud to construct a kd-tree
- perform multiplicative transformations whose parameters are shared across the tree (mimicking ConvNets)
- do not rely on grids, avoiding poor scaling behavior
1.2. Related Works
- 3D Conv (+ GAN)
- 2D Conv (2D projection of 3D obj)
- spectral Conv
- PointNet
- RNN
- OctNet (Oct-Trees)
- Graph-based ConvNet
1.3. Dataset
- classification. ModelNet10, ModelNet40
- shape retrieval. SHREC’16
- shape part segmentation. ShapeNet part dataset
2. Network
2.1. Input
Recursively divide the point cloud into two equally sized subsets, yielding N - 1 internal nodes, each with a split direction d_i (along x, y, or z).
- N. the fixed size of the point cloud (sub-sample or over-sample to reach it)
- d_i. split direction of node i
- l_i. tree level of node i
- c_1(i) = 2i, c_2(i) = 2i + 1. children of the ith node
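The construction above can be sketched as follows (a minimal numpy version; the paper selects each split axis as the one with the widest coordinate range, and the function name here is mine):

```python
import numpy as np

def build_kdtree(points, depth):
    """Recursively split a point cloud into two equally sized halves.

    Returns the split directions d_i (0=x, 1=y, 2=z) of the internal
    nodes in breadth-first order, plus the leaf subsets.
    """
    directions = []
    queue = [points]
    for _ in range(depth):
        next_queue = []
        for subset in queue:
            # split along the axis with the widest spread
            d = int(np.argmax(subset.max(axis=0) - subset.min(axis=0)))
            order = np.argsort(subset[:, d])
            half = len(subset) // 2
            directions.append(d)
            next_queue.append(subset[order[:half]])
            next_queue.append(subset[order[half:]])
        queue = next_queue
    return directions, queue
```

With N = 2^depth points, every leaf ends up holding exactly one point, which is why the cloud is first resampled to a fixed size N.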
2.2. Processing Data with Kd-Net
Given a kd-tree, compute the representation v_i of each node bottom-up: v_i = φ(W_{d_i, l_i} [v_{c_1(i)}; v_{c_2(i)}] + b_{d_i, l_i}). Within each level, all nodes with the same split direction share one affine layer.
- v_i. the representation of the ith node
- φ. ReLU
- [;]. concatenation
- W, b. parameters of the layer at level l_i for split direction d_i (W: m_l x 2m_{l+1}, b: m_l)
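A minimal numpy sketch of this bottom-up pass (the breadth-first node layout and the container shapes are my assumptions; `weights[l][d]` is the matrix shared by all level-l nodes with split direction d):

```python
import numpy as np

def kdnet_forward(leaves, directions, weights, biases):
    """Bottom-up Kd-Net pass: v_i = relu(W[l][d] @ [v_c1; v_c2] + b[l][d]).

    leaves:     list of 2**L leaf feature vectors (e.g. raw 3D coordinates).
    directions: split direction d_i of each internal node, breadth-first.
    weights, biases: weights[l][d] has shape (m_l, 2*m_{l+1}) and is shared
                     by every node at level l whose split direction is d.
    """
    relu = lambda x: np.maximum(x, 0.0)
    level = list(leaves)
    L = int(np.log2(len(leaves)))
    for l in range(L - 1, -1, -1):
        start = 2 ** l - 1  # breadth-first index of the first node at level l
        parents = []
        for k in range(2 ** l):
            d = directions[start + k]
            x = np.concatenate([level[2 * k], level[2 * k + 1]])  # [v_c1; v_c2]
            parents.append(relu(weights[l][d] @ x + biases[l][d]))
        level = parents
    return level[0]  # root representation
```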
2.3. Classification
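The classification head attaches a fully connected classifier to the root representation; a hedged sketch (names are hypothetical):

```python
import numpy as np

def classify(v_root, W_cls, b_cls):
    """Linear layer + softmax over class scores on the root feature."""
    logits = W_cls @ v_root + b_cls
    e = np.exp(logits - logits.max())  # numerically stabilized softmax
    return e / e.sum()
```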
2.4. Shape Retrieval
- output a descriptor vector (the classifier head trained for classification is removed)
- histogram loss. Siamese loss or triplet loss could be used instead
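For intuition, a sketch of the triplet loss mentioned as an alternative (the margin value is an arbitrary choice here, not from the paper):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on descriptor distances: pull same-class descriptors together,
    push different-class ones at least `margin` further apart."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```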
2.5. Segmentation
- mimics an encoder-decoder (hourglass) architecture
- skip connection
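One top-down decoder step with a skip connection might look like this (the exact combination rule is my assumption, not the paper's formula):

```python
import numpy as np

def decode_step(parent_out, encoder_feat, W, b, W_skip, b_skip):
    """Combine the parent's decoded feature with the matching encoder
    feature of the same node (hourglass-style skip connection)."""
    relu = lambda x: np.maximum(x, 0.0)
    return relu(W @ parent_out + b) + relu(W_skip @ encoder_feat + b_skip)
```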
2.6. Properties
- Layerwise Parameter Sharing
- CNN. shares kernels across all spatial locations of each localized multiplication
- Kd-Net. shares (1x1) kernels among nodes with the same split direction at the same level
- Hierarchical Representation
- Partial Invariance to Jitter
- split direction
- Non-invariance to Rotation
- Role of kd-tree Structure
- the kd-tree determines the order in which leaf representations are combined
- the kd-tree itself can be regarded as a shape descriptor
2.7. Details
- normalize 3D coordinates into [-1, 1]^3, with the origin at the centroid
- data augmentation. perturbing geometric transformations; injecting randomness into kd-tree construction (sampling split directions probabilistically)
- γ=10.
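The normalization step can be sketched as (uniform scaling is an assumption here; it preserves the aspect ratio):

```python
import numpy as np

def normalize_cloud(points):
    """Center a point cloud at its centroid and scale it into [-1, 1]^3."""
    centered = points - points.mean(axis=0)
    scale = np.abs(centered).max()  # one uniform scale for all axes
    return centered / scale
```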
3. Experiments
3.1. Details
- MNIST→2D point cloud. one point at the center of each non-zero pixel
- 3D point cloud. sample faces (proportionally to area)→sample points from the faces
- Self-ensemble in test time
- Augmentation
- TR. translation along axes, ±0.1
- AS. anisotropic rescaling
- DT. deterministic tree
- RT. randomized tree
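The face-then-point sampling used to turn meshes into point clouds can be sketched as (area-weighted face choice, then uniform barycentric sampling inside each triangle):

```python
import numpy as np

def sample_mesh(vertices, faces, n):
    """Sample n points from a triangle mesh: pick faces with probability
    proportional to their area, then sample uniformly inside each face."""
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    areas = 0.5 * np.linalg.norm(np.cross(v1 - v0, v2 - v0), axis=1)
    idx = np.random.choice(len(faces), size=n, p=areas / areas.sum())
    # uniform barycentric sampling over each chosen triangle
    r1, r2 = np.random.rand(n, 1), np.random.rand(n, 1)
    s = np.sqrt(r1)
    return (1 - s) * v0[idx] + s * (1 - r2) * v1[idx] + s * r2 * v2[idx]
```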
3.2. Classification
3.3. Ablation
3.4. Shape Retrieval
- 20 rotations→pooling→FC
3.5. Part Segmentation
- duplicated random sampling with small added noise. helps with rare classes
- at test time, predict on the upsampled cloud, then map the predictions back to the original points
- low memory footprint < 120 MB
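Mapping predictions on the upsampled cloud back to the original points can be done by nearest neighbour, e.g.:

```python
import numpy as np

def map_back(original, upsampled, upsampled_preds):
    """Assign each original point the prediction of its nearest
    neighbour in the upsampled cloud."""
    d = np.linalg.norm(original[:, None, :] - upsampled[None, :, :], axis=-1)
    return upsampled_preds[d.argmin(axis=1)]
```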